Effectively Finding Relevant Web Pages from Linkage Information
Authors
Abstract
This paper presents two hyperlink analysis-based algorithms for finding pages relevant to a given Web page (URL). The first algorithm is derived from an extended cocitation analysis of Web pages. It is intuitive and easy to implement. The second uses linear algebra theory to reveal deeper relationships among Web pages and to identify relevant pages more precisely and effectively. The experimental results show the feasibility and effectiveness of the algorithms. These algorithms could be used in various Web applications, such as enhancing Web search. The ideas and techniques in this work should also be helpful to other Web-related research.
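As a rough illustration of the cocitation idea mentioned in the abstract, the sketch below counts how many "parent" pages link to both a target page and each other page, then ranks candidates by that count. It shows plain co-citation counting only, not the paper's extended algorithm or its linear algebra method. The function name `cocited_pages`, the `links` dictionary layout, and the sample page names are all assumptions made for this example.

```python
from collections import Counter


def cocited_pages(links, target, top_k=5):
    """Rank pages by the number of parent pages that link to both them and `target`.

    `links` is a hypothetical adjacency mapping: page -> list of pages it links to.
    """
    # Parents are pages whose outgoing links include the target.
    parents = [src for src, outs in links.items() if target in outs]

    counts = Counter()
    for parent in parents:
        # Use a set so a parent that repeats a link is counted once.
        for sibling in set(links[parent]):
            if sibling != target:
                counts[sibling] += 1

    # Highest co-citation count first; ties broken by name for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy link graph (illustrative data only).
links = {
    "hub1": ["a", "b", "c"],
    "hub2": ["a", "b"],
    "hub3": ["a", "d"],
    "hub4": ["c", "d"],
}

print(cocited_pages(links, "a"))
```

In this toy graph, page "b" shares two parents with "a" and so ranks ahead of "c" and "d", which share one each. A real system would build `links` from crawled hyperlink data and would likely need to filter out links within the same site.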
Similar resources
Artificial Bee Colony (ABC) Approach for Ranking Web Pages
The World Wide Web (WWW) is growing rapidly in all aspects and is a massive, explosive data resource. In the information retrieval approach, Web search engines are the predominant tools for finding and accessing the contents of the Web. The primary goal of a search engine is to provide relevant information to users according to their needs. Usually search engines give large result s...
Enhancing Web Search through Query Expansion
Web search engines help users find relevant web pages by returning a result set containing the pages that best match the user’s query. When the identified pages have low relevance, the query must be refined to capture the search goal more effectively. However, finding appropriate refinement terms is difficult and time consuming for users, so researchers developed query expansion approaches to i...
Information Retrieval Issues on the World Wide Web
The World Wide Web (Web) is the largest information repository, containing billions of interconnected documents (called Web pages) authored by billions of people and organizations. The Web's huge, diverse, unstructured or semi-structured, dynamic, and multilingual nature makes searching it effectively and efficiently a challenging research problem....
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; a key technique among them is focused crawling, which is able to crawl particular topical...
Analyzing new features of infected web content in detection of malicious web pages
Recent improvements in web standards and technologies enable attackers to hide and obfuscate malicious code with new methods and thus escape security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...
Journal: IEEE Trans. Knowl. Data Eng.
Volume: 15
Publication date: 2003